arXiv:1506.09140v1 [cs.FL] 30 Jun 2015


Pure Strategies in Imperfect Information Stochastic Games 


Arnaud Carayol^1, Christof Löding^2 and Olivier Serre^3

^1 LIGM (CNRS & Université Paris Est)
^2 Informatik 7, RWTH Aachen
^3 LIAFA (CNRS & Université Paris Diderot - Paris 7)

July 1, 2015 


Abstract 

We consider imperfect information stochastic games where we require the players to
use pure (i.e. non-randomised) strategies. We consider reachability, safety, Büchi and
co-Büchi objectives, and investigate the existence of almost-surely/positively winning
strategies for the first player when the second player is perfectly informed or more informed
than the first player. We obtain decidability results for positive reachability and almost-sure
Büchi, with optimal algorithms to decide the existence of a pure winning strategy and to
compute one if it exists. We complete the picture by showing that positive safety is undecidable
when restricting to pure strategies, even if the second player is perfectly informed.

1 Introduction


The study of two-player games has received a lot of attention in the last decade, mainly 
motivated by applications to the verification of reactive open systems. Those systems are 
composed of a program (represented by the first player, Eve) and some (possibly hostile) 
environment (represented by the second player, Adam). The verification problem consists in 
deciding whether the program can be restricted so that the system meets some given
specification whatever the environment does. Here, restricting the program means synthesizing a
controller [16], which, in terms of games, is equivalent to designing a strategy for Eve that is 
winning against any strategy of Adam. 

Of course, the class of games to consider depends on the class of systems that one intends
to model. This may lead one to consider various features such as concurrency (the players
independently and simultaneously choose their actions, whose parallel execution determines the
next state), stochastic transitions (the next state is chosen according to a probability
distribution depending on the current state and on the actions chosen by the players) or imperfect
information (the players do not observe the exact state). Note that imperfect information is
necessary if one wants, for instance, to model a system where the program and the environment
share some public variables while also having their own private variables [?].

Recently, in [?], two (mainly equivalent) models of concurrent stochastic games with
imperfect information have been introduced. They capture several known models
(such as those from [?]) while preserving the main decidability results.

In this paper we consider the games as introduced in [?] (we use the formalism of [?]).
These are finite state games in which, at each round, the two players concurrently choose
an action, and based on these actions the successor state is chosen according to some fixed
probability distribution. The resulting infinite play is won by Eve if it satisfies a given objective.
The objectives we consider here are reachability (is there a final state eventually visited?),
safety (are forbidden states never visited?), Büchi (is there a final state that is visited infinitely
often?) and co-Büchi (are forbidden states visited only finitely often?). Imperfect information is
modelled as follows: both players have an equivalence relation over states and, instead of
observing the exact state, they only observe its equivalence class.

In [13, 2] the authors considered general strategies where a player is allowed to use
randomisation when choosing her/his next action. It was then shown, for Büchi objectives,
that one can decide whether Eve has such a strategy $\varphi$ that is almost-surely winning against
any strategy $\psi$ of Adam (meaning that an infinite play played according to $\varphi$ and $\psi$ is won
by Eve with probability 1). It was also established in [2] that one can decide for co-Büchi
objectives whether Eve has a positively winning strategy.

In the present work we restrict our attention to pure strategies, i.e. we forbid the players
to randomise when choosing their actions. Our initial motivation for this work comes from
automata theory. The emptiness problem for automata on infinite trees can be described as
the problem of computing a winning strategy in a two-player game of infinite duration. The
required game model depends on the class of automata that is considered. In particular, [?]
proposes a reduction of the emptiness problem for alternating tree automata to the existence
of a pure winning strategy for Eve in an imperfect information game. For capturing the
automaton model with a qualitative acceptance condition as introduced in [3], one furthermore
needs stochastic games (and up to now this is the only known method for checking emptiness
of such automata). So one of our aims is to obtain a toolbox and to understand the limits of
this method for checking emptiness of tree automata.

Our main results are the following. 

• On the negative side, by a reduction from the value 1 problem for probabilistic word
automata [?], we prove that even if Adam is fully informed and Eve is totally blind
(i.e. all states are indistinguishable for her), it is undecidable whether Eve can positively
win a safety game (Section 3). Under the same restrictions, positive winning in Büchi
games and almost-sure winning in co-Büchi games are proved to be undecidable by
reduction from the emptiness problem for classes of probabilistic ω-word automata [1].

• To obtain positive results, we have to impose restrictions on how Adam is informed.
We consider the case where he has perfect information and the case where he is more
informed than Eve^1. In both situations we show that it is decidable whether Eve has a
positively winning pure strategy in a reachability game (Section 4). Using this result in
a fixpoint computation, we prove that one can decide whether Eve has an almost-surely
winning pure strategy in a Büchi game (Section 5). Moreover, if it exists, such a strategy
can be constructed and requires finite memory. In both cases, we obtain matching upper
and lower complexity bounds.

The decidability results for the special case where Adam is perfectly informed were also
obtained in [6]. However, the technique we develop here is different and in particular uses the
positive winning case as a toolbox, which later permits us to handle the more general case 
where Adam is more informed than Eve. And while [6] focuses on reachability conditions 
and studies the memory required for winning strategies depending on how the players are 
informed, we focus on the case in which Adam is better informed than Eve (or even perfectly 
informed), and study different winning conditions. 

In Section 2 we introduce the basic concepts. In Section 3 we present the undecidability
results. In Section 4 we address the decidability of whether Eve positively wins in a reachability
game, and we use this result in Section 5 when considering almost-sure winning for Büchi
conditions. Section 6 gives matching lower bounds for the results in Sections 4 and 5. Finally,
Section 7 summarises the positive and negative results presented in the paper.

2 Definitions 

A probability distribution over a finite set $X$ is a mapping $d : X \to [0,1]$ such that
$\sum_{x \in X} d(x) = 1$. In the sequel we denote by $\mathcal{D}(X)$ the set of probability distributions over $X$.
Given some set $X$ and some equivalence relation $\sim$ over $X$, $[x]_\sim$ stands for the equivalence
class of $x$ for $\sim$ and $X/_\sim = \{[x]_\sim \mid x \in X\}$ denotes the set of equivalence classes of $\sim$. As
usual we write $A^*$ (resp. $A^\omega$) for the set of finite (resp. infinite) words over some finite
alphabet $A$. For $k \ge 0$ we denote by $A^{\ge k}$ (resp. $A^{\le k}$) the set of words of length at least (resp.
at most) $k$.

A concurrent arena with imperfect information (or simply an arena) is a tuple
$\mathcal{A} = \langle S, \Sigma_E, \Sigma_A, \delta, \sim_E, \sim_A \rangle$ where $S$ is a finite set of states; $\Sigma_E$ (resp. $\Sigma_A$) is the (finite) set
of actions for Eve (resp. Adam); $\delta : S \times \Sigma_E \times \Sigma_A \to \mathcal{D}(S)$ is the (total) transition function;
and $\sim_E$ and $\sim_A$ are equivalence relations over states.

^1 We say that Adam is more informed than Eve when his equivalence relation on the states of the game
refines that of Eve. In particular, this is the case when Adam is perfectly informed.

Figure 1: A concurrent arena where Adam is perfectly informed while Eve cannot distinguish
states $s_1$, $s_2$, $s_3$ and $s_4$.

A play in such an arena proceeds as follows. First it starts in some initial state $s$. Then
the first player, Eve, picks an action $\sigma_E \in \Sigma_E$ and, simultaneously and independently, the
second player, Adam, chooses an action $\sigma_A \in \Sigma_A$. Then a successor state is chosen according
to the probability distribution $\delta(s, \sigma_E, \sigma_A)$, and the process restarts: the players choose a new
pair of actions that induces, together with the current state, a new state and so on forever.
Hence, a play is an infinite sequence $s_0 (\sigma_E^0, \sigma_A^0) s_1 (\sigma_E^1, \sigma_A^1) s_2 \cdots$ in $(S \cdot (\Sigma_E \times \Sigma_A))^\omega$ such that
for every $i \ge 0$, $\delta(s_i, \sigma_E^i, \sigma_A^i)(s_{i+1}) > 0$. In the sequel we refer to a prefix of a play ending in
a state as a partial play.

The intuitive meaning of $\sim_E$ (resp. $\sim_A$) is that two states $s_1$ and $s_2$ such that $s_1 \sim_E s_2$
(resp. $s_1 \sim_A s_2$) cannot be distinguished by Eve (resp. by Adam). We easily extend
the relation $\sim_X$, with $X \in \{E, A\}$, to partial plays as follows. First, for any partial play
$\lambda = s_0 (\sigma_E^0, \sigma_A^0) s_1 (\sigma_E^1, \sigma_A^1) \cdots s_k$ denote $[\lambda]_{\sim_X} = [s_0]_{\sim_X} [s_1]_{\sim_X} \cdots [s_k]_{\sim_X}$; then define
$\lambda \sim_X \lambda'$ if and only if $[\lambda]_{\sim_X} = [\lambda']_{\sim_X}$.

We say that Adam is more informed than Eve if $\sim_A \subseteq \sim_E$, and Adam is perfectly
informed if $\sim_A$ is the equality relation.

Example 1. Consider the concurrent game with imperfect information depicted in Figure 1.
Let $\Sigma_E = \Sigma_A = \{a, b\}$. The initial state is $s_0$ and from $s_0$, if Adam plays the action $a$ then
any action played by Eve leads with probability 1/2 either to $s_1$ or $s_2$. Similarly, if Adam plays $b$
then any action played by Eve leads with probability 1/2 either to $s_3$ or $s_4$. In the states $s_1$, $s_2$, $s_3$
and $s_4$, which are indistinguishable by Eve, the action of Adam has no impact. If Eve plays $a$
from $s_1$ or $s_4$, or $b$ from $s_2$ or $s_3$, the play goes to the final state $f$, which is a sink state. Any
other action by Eve from one of those states leaves the current state unchanged.
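
As an illustration, the arena of Example 1 can be encoded directly and the probability of reaching f computed by propagating the state distribution. This is only a sketch under stated assumptions: the names `delta`, `reach_probability` and `WINNING_ACTION` are ours, and both players are represented by a fixed sequence of actions (one per round), which is enough here because Eve observes the same sequence of classes on every play that has not yet reached f, and Adam's action only matters in the initial state.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
# Action that sends each of the four indistinguishable states to the final sink f.
WINNING_ACTION = {"s1": "a", "s4": "a", "s2": "b", "s3": "b"}


def delta(state, eve, adam):
    """Transition function of Example 1, returned as {successor: probability}."""
    if state == "s0":
        # Eve's action is irrelevant in s0; Adam selects the pair of successors.
        if adam == "a":
            return {"s1": HALF, "s2": HALF}
        return {"s3": HALF, "s4": HALF}
    if state == "f":
        return {"f": Fraction(1)}  # f is a sink
    target = "f" if WINNING_ACTION[state] == eve else state
    return {target: Fraction(1)}


def reach_probability(eve_actions, adam_actions):
    """Probability of being in f after playing both action sequences from s0."""
    dist = {"s0": Fraction(1)}
    for eve, adam in zip(eve_actions, adam_actions):
        nxt = {}
        for state, p in dist.items():
            for succ, q in delta(state, eve, adam).items():
                nxt[succ] = nxt.get(succ, Fraction(0)) + p * q
        dist = nxt
    return dist.get("f", Fraction(0))
```

With this encoding, the pure strategy that plays an arbitrary action, then a, then b reaches f with probability 1 whatever Adam does in the first round, whereas stopping after the a only guarantees probability 1/2.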

In order to choose their moves the players follow strategies, and, for this, they may use
all the information they have about what was played so far. However, if two partial plays are
equivalent for $\sim_E$ (resp. $\sim_A$) then Eve (resp. Adam) cannot distinguish between them, and
should behave the same. This leads to the following notion.

An observation-based pure strategy (simply called strategy in the following) for Eve
is a function $\varphi : (S/_{\sim_E})^* \to \Sigma_E$, i.e., to choose her next action, Eve considers the sequence
of observations she has seen so far. We overload $\varphi$ by writing $\varphi(\lambda)$ instead of $\varphi([\lambda]_{\sim_E})$: in
particular, a strategy $\varphi$ for Eve is such that $\varphi(\lambda) = \varphi(\lambda')$ whenever $\lambda \sim_E \lambda'$ (and similarly
for Adam).
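
The following sketch shows one possible way to represent such a strategy: a lookup table indexed by the sequence of observed equivalence classes, so that equivalent partial plays necessarily receive the same action. The helper names (`observations`, `equivalent`, `make_strategy`) and the class labels in `OBS_EVE` are hypothetical; the classes are those of Example 1, assuming that the initial state and the final state are each alone in their class. Partial plays are given as sequences of states only, since actions are not observed.

```python
def observations(states, obs):
    """Sequence of equivalence classes seen along a partial play."""
    return tuple(obs[s] for s in states)


def equivalent(play1, play2, obs):
    """Two partial plays are equivalent iff they yield the same observations."""
    return observations(play1, obs) == observations(play2, obs)


def make_strategy(table, default):
    """Build an observation-based pure strategy from a finite table.

    Observation sequences missing from the table are mapped to `default`.
    """
    def strategy(states, obs):
        return table.get(observations(states, obs), default)
    return strategy


# Assumed observation classes for Eve in Example 1.
OBS_EVE = {"s0": "init", "s1": "mid", "s2": "mid", "s3": "mid", "s4": "mid", "f": "final"}

eve = make_strategy(
    {("init",): "a", ("init", "mid"): "a", ("init", "mid", "mid"): "b"},
    default="a",
)
```

Because the table is keyed by observations rather than by states, the constraint that equivalent partial plays get the same action holds by construction.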

A finite-memory strategy for Eve is a strategy that can be computed by a finite
automaton with output that reads the observation sequence of the partial play and outputs the
next action of Eve. We do not give a precise technical definition because it is not needed in 
this work. The size of such a strategy corresponds to the number of states of the automaton. 

Strategies for Adam are defined in a similar way by replacing $\sim_E$ by $\sim_A$ and $\Sigma_E$ by $\Sigma_A$.

Remark 1. In our definition of a strategy we implicitly assume that the players only observe 
the sequence of states and not the corresponding sequence of actions. While the fact that 
a player does not observe what his adversary has played is reasonable (otherwise imperfect 
information on states would make less sense) one could object that the player should observe 
the actions she has played so far. However, as the players do not use randomisation in their 
strategies, they can always retrieve the actions they played so far. 

Let $\mathcal{A} = \langle S, \Sigma_E, \Sigma_A, \delta, \sim_E, \sim_A \rangle$ be an arena, let $s_0 \in S$ be an initial state, $\varphi_E$ be a strategy
for Eve and $\varphi_A$ be a strategy for Adam. First we let $\mathrm{Outcomes}(s_0, \varphi_E, \varphi_A)$ be the set of all
possible plays when the game starts in $s_0$ and when Eve and Adam respectively follow $\varphi_E$
and $\varphi_A$. More formally, a play $\lambda = s_0 (\sigma_E^0, \sigma_A^0) s_1 (\sigma_E^1, \sigma_A^1) \cdots$ belongs to $\mathrm{Outcomes}(s_0, \varphi_E, \varphi_A)$
iff $\delta(s_i, \varphi_E([s_0]_{\sim_E} [s_1]_{\sim_E} \cdots [s_i]_{\sim_E}), \varphi_A([s_0]_{\sim_A} [s_1]_{\sim_A} \cdots [s_i]_{\sim_A}))(s_{i+1}) > 0$ for every $i \ge 0$.
Then we are interested in defining the probability of a (measurable) set of plays, knowing that
Eve (resp. Adam) uses $\varphi_E$ (resp. $\varphi_A$). This is done in the usual way (see e.g. [5]): once a pair
$(\varphi_E, \varphi_A)$ of strategies for both players is fixed, one is left with a (possibly infinite) Markov chain
that naturally induces a probability space over the Borel σ-field generated by the cones, where
for any partial play $\lambda$ starting in $s_0$ the cone for $\lambda$ is the set $\mathrm{cone}(\lambda) = \lambda \cdot ((\Sigma_E \times \Sigma_A) \cdot S)^\omega$ of
all infinite plays with prefix $\lambda$. We let $\Pr_{s_0}^{\varphi_E, \varphi_A}$ denote the corresponding probability measure
over this space.

An objective for Eve is a (measurable) set $O$ of plays: a play is won by Eve if it belongs
to $O$; otherwise it is won by Adam. A concurrent game with imperfect information
(simply called game in the following) is a triple $G = (\mathcal{A}, s_0, O)$ where $\mathcal{A}$ is an arena, $s_0$ is an
initial state and $O$ is an objective. In the sequel we focus on the following special classes of
ω-regular objectives (note that all of them are Borel sets, hence measurable) that we define
using a subset $F \subseteq S$ of final states.

A reachability objective (resp. safety objective) is of the form $(S \cdot (\Sigma_E \times \Sigma_A))^* F ((\Sigma_E \times \Sigma_A) \cdot S)^\omega$
(resp. of the form $((S \setminus F) \cdot (\Sigma_E \times \Sigma_A))^\omega$): a play is winning if it contains (resp. does not
contain) a final state.

A Büchi objective (resp. co-Büchi objective) is of the form
$\bigcap_{k \ge 0} (S \cdot (\Sigma_E \times \Sigma_A))^{\ge k} F ((\Sigma_E \times \Sigma_A) \cdot S)^\omega$
(resp. of the form $(S \cdot (\Sigma_E \times \Sigma_A))^* ((S \setminus F) \cdot (\Sigma_E \times \Sigma_A))^\omega$): a play is winning if it
goes infinitely often (resp. finitely often) through final states.
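
To make the four objectives concrete, here is a small sketch that decides them on ultimately periodic plays. The restriction to plays of the shape prefix followed by a cycle repeated forever is an assumption made only so that the infinite play has a finite description; actions are omitted because the objectives only constrain the states visited. The function names are ours.

```python
def wins_reachability(prefix, cycle, final):
    """Some final state is visited somewhere along prefix . cycle^omega."""
    return any(s in final for s in list(prefix) + list(cycle))


def wins_safety(prefix, cycle, final):
    """No final state is ever visited."""
    return not wins_reachability(prefix, cycle, final)


def wins_buchi(prefix, cycle, final):
    """A final state is visited infinitely often, i.e. it occurs in the cycle."""
    return any(s in final for s in cycle)


def wins_co_buchi(prefix, cycle, final):
    """Final states are visited only finitely often, i.e. never in the cycle."""
    return not wins_buchi(prefix, cycle, final)
```

For instance, a play that visits a final state once in its prefix and then loops through non-final states satisfies the reachability and co-Büchi objectives but neither the safety nor the Büchi objective.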

A reachability (resp. safety, Büchi, co-Büchi) game is a game equipped with a reachability
(resp. safety, Büchi, co-Büchi) objective. In the sequel we may replace $O$ by $F$ when it is
clear from the context which objective we consider.

Fix a game $G = (\mathcal{A}, s_0, O)$. A strategy $\varphi_E$ for Eve is surely winning if, for any
counter-strategy $\varphi_A$ for Adam, $\mathrm{Outcomes}(s_0, \varphi_E, \varphi_A) \subseteq O$. If such a strategy exists, we say that
Eve surely wins $G$. A strategy $\varphi_E$ for Eve is almost-surely winning (resp. positively
winning) if, for any counter-strategy $\varphi_A$ for Adam, $\Pr_{s_0}^{\varphi_E, \varphi_A}(O) = 1$ (resp. $> 0$). If such a
strategy exists, we say that Eve almost-surely wins (resp. positively wins) $G$.

In this paper, we are interested in deciding the existence of almost-surely/positively winning
strategies for Eve for safety/reachability/Büchi/co-Büchi games.

Example 2. Consider the (perfect information) concurrent reachability game depicted below
with $q_f$ as unique final state. In state $q_w$, if both players choose the same action then they
stay in state $q_w$, and otherwise they move to state $q_f$. In state $q_f$, all choices of actions stay
in state $q_f$. Eve does not have any almost-surely winning strategy.

[Figure: arena with states $q_w$ and $q_f$; recoverable edge labels: 0|0, 1|1 and *|*.]

Indeed, given any strategy $\varphi_E$ for Eve, the counter-strategy $\varphi_A$ for Adam mirroring the
strategy of Eve (i.e. $\varphi_A = \varphi_E$) only allows for the play $q_w^\omega$ and hence $\Pr_{q_w}^{\varphi_E, \varphi_A}(O) = 0$.
Similarly Adam does not have an almost-surely winning strategy. For any fixed strategy
$\varphi_A$ of Adam, any counter-strategy $\varphi_E$ for Eve that satisfies $\varphi_E(q_w) \ne \varphi_A(q_w)$ is such that
$\Pr_{q_w}^{\varphi_E, \varphi_A}(O) = 1$.
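
The mirroring argument can be checked exhaustively on bounded horizons. In the sketch below (names are ours), a pure strategy is represented by the sequence of bits it plays while the game is still in the non-final state, which is all that matters since the history is then fully determined; the horizon `N = 5` is an arbitrary choice.

```python
from itertools import product


def run(eve_bits, adam_bits):
    """State reached in the game of Example 2 after playing both bit sequences."""
    state = "qw"
    for e, a in zip(eve_bits, adam_bits):
        if state == "qw" and e != a:
            state = "qf"  # differing actions move to the final sink
    return state


N = 5
ALL = list(product((0, 1), repeat=N))

# Adam copies Eve: the play never leaves qw.
mirror_keeps_qw = all(run(bits, bits) == "qw" for bits in ALL)

# Eve plays the opposite of a fixed Adam: qf is reached immediately.
flip_reaches_qf = all(run(tuple(1 - b for b in bits), bits) == "qf" for bits in ALL)

# Against a fixed Adam, exactly one of the 2^N sequences for Eve stays in qw,
# which is why uniform randomisation leaves qw with probability 1 - 2^(-N).
stay_counts = {adam: sum(run(eve, adam) == "qw" for eve in ALL) for adam in ALL}
```

The last computation gives a finite-horizon view of why the uniformly randomised strategy succeeds where every pure strategy fails.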

Remark 2. The situation in Example 2 contrasts with the case of perfect information
non-concurrent ω-regular games, which are determined: from any state one of the players has a
surely winning strategy (see e.g. [?]). In the perfect information setting (even with
concurrency and stochastic transition function) there is also a determinacy result, when allowing
randomised strategies, using the notion of values (see e.g. [?] for ω-regular objectives or [?]
for a very general result). In the imperfect information setting, if one allows randomisation in
strategies, one has, for Büchi conditions, a determinacy result (called qualitative determinacy
in [?]): either Eve has an almost-surely winning strategy or Adam has a positively winning
strategy (in Example 2 the randomised strategy for Eve consisting in playing 0 and 1 with
equal probability in any state is almost-surely winning).

3 Undecidability Results 

In this section we provide undecidability results for certain combinations of types of winning
strategies and objectives. An easy consequence of undecidability results for probabilistic
ω-automata from [1] is stated in the following theorem. In these reductions, Eve plays alone
and cannot distinguish any states of the game. The states and transitions of the game are
those of the ω-automaton and the strategy of Eve corresponds to the input word.

Theorem 1. The decision problems whether Eve almost-surely wins a given co-Büchi game
or positively wins a given Büchi game are undecidable (even if the set of actions of Adam is a
singleton).

Proof. Consider a probabilistic automaton $\mathcal{A}$ on ω-words as in [1]. Now consider a concurrent
game with imperfect information where Adam plays no role, where Eve's actions are the
letters of the input alphabet $A$ of $\mathcal{A}$, and whose states are those of the automaton.
Moreover all states are $\sim_E$-equivalent. The transition function of the game mimics the
one of the automaton. As Eve does not observe anything, a (pure) strategy $\varphi$ of Eve can
be described as an infinite word $w_\varphi \in A^\omega$ (the i-th letter being the i-th action played by
Eve), and $\varphi$ is almost-surely (resp. positively) winning iff the probability of a run of $\mathcal{A}$ over
$w_\varphi$ to be accepting is 1 (resp. strictly positive). The undecidability results follow from the
undecidability of the emptiness problem for co-Büchi (resp. Büchi) probabilistic automata
with the almost-sure (resp. positive) semantics [1]. □

In the following we prove that the existence of positively winning strategies for safety
objectives is undecidable. Our result is based on the undecidability of the value 1 problem for
probabilistic automata on finite words [?]. For simpler use in our reduction we reformulate
this problem in terms of games.

Consider the class of concurrent reachability games $G$ with imperfect information with
the following properties. Eve is blind (i.e. $\sim_E$ has a unique equivalence class), and
Adam has no impact on the game (i.e. his set of actions is a singleton). Furthermore, there
is a special action $\sharp$ that Eve can play at any time, and that leads (depending on the current
state) either to a final sink state or to a non-final sink state. The final sink state is the only
final state. Intuitively, one can think of such a game as one where Eve plays a sequence of
actions and then declares by $\sharp$ that she stops (and she wins if she stopped in a state that leads
to the winning sink).

We refer to this type of game as a probabilistic automaton game (PA game) because such games
correspond to probabilistic automata on finite words (see [?] for an introduction to probabilistic
automata): a strategy of Eve corresponds to a finite word followed by $\sharp$ (without playing $\sharp$
Eve surely loses), and the probability that it is winning is the probability of the word to be
accepted in the automaton. Then we have the following result, which directly follows from
the undecidability of the value 1 problem for such automata [?].

Lemma 1. For a given PA game, it is undecidable whether Eve has, for each $0 < \varepsilon < 1$, a
strategy that is winning with probability $p$ such that $(1 - \varepsilon) < p < 1$.
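
The correspondence between a strategy in a PA game and a finite word can be sketched as follows: the winning probability of "play the word, then stop" is the acceptance probability of the word, obtained by pushing the initial distribution through the transition matrices. The two-state automaton `TOY` below is a hypothetical example of ours, not taken from the paper; it is chosen because its acceptance probabilities get arbitrarily close to 1 without ever reaching it, which is the situation described in Lemma 1.

```python
from fractions import Fraction


def acceptance_probability(word, initial, transitions, accepting):
    """Probability that the automaton is in an accepting state after reading word.

    transitions[state][letter] is a dict {successor: probability}.
    """
    dist = {initial: Fraction(1)}
    for letter in word:
        nxt = {}
        for state, p in dist.items():
            for succ, q in transitions[state][letter].items():
                nxt[succ] = nxt.get(succ, Fraction(0)) + p * q
        dist = nxt
    return sum((p for s, p in dist.items() if s in accepting), Fraction(0))


HALF = Fraction(1, 2)
# Hypothetical automaton: from state 0, letter x moves to the accepting
# state 1 with probability 1/2; state 1 is absorbing.
TOY = {
    0: {"x": {0: HALF, 1: HALF}},
    1: {"x": {1: Fraction(1)}},
}
```

Here reading n letters x before stopping wins with probability 1 - 2^(-n), so for each ε some word wins with probability strictly between 1 - ε and 1.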

Our reduction that uses Lemma 1 starts from an example of a concurrent safety game
known as Hide-or-Run [9], denoted $G_{hr}$ in the sequel. In this game, Adam can choose between
hiding ($h$) and running ($r$), and Eve can choose between waiting ($w$) and throwing ($t$) her only
snowball. If Adam hides and Eve waits, the game stays in state $s_{hide}$. If Adam runs and Eve
throws the snowball, then Adam is hit, and the game proceeds to sink state $s_{wet}$. In all other
cases, Adam gets home (either he runs without being hit or he can safely run after Eve has
thrown her snowball) and the game proceeds to sink state $s_{home}$. This is a safety game where
Eve wants to avoid visiting $s_{home}$.

In [9] it is shown that Eve can only win by using a randomised strategy that plays action
$w$ in round $i$ with probability $p_i$ such that $0 < p_i < 1$ for all $i$ and $\prod_i p_i > 0$ (for this, Eve
does not have to distinguish the states).

Now the idea is to incorporate a gadget in $G_{hr}$ that permits Eve to simulate random
choices while playing deterministically.

Figure 2: The modified version of Hide-or-Run: Black states in $G_r$/$G_h$ correspond to
states from which $\sharp$ led to the final state in $G$, and $x$ denotes any letter different from $\sharp$.


Theorem 2. It is undecidable whether Eve positively wins in a safety game (resp. co-Büchi
game), even if $\sim_E$ consists of a single equivalence class.

Proof. Consider a probabilistic automaton game $G$ with a set of actions disjoint from the one
in the game $G_{hr}$. Let $G_r$ and $G_h$ be two disjoint copies of $G$ where we removed the two
states reachable by Eve playing $\sharp$ (the $\sharp$-edges are redirected as described below).

In the game $G'_{hr}$ (see Figure 2), the concurrent choices of the actions in $G_{hr}$ are simulated
by the imperfect information. All states are indistinguishable by Eve. First Adam makes his
choice $r$ or $h$ from $s_{hide}$ (Eve's action has no impact). The game then moves to the initial
state of $G_r$ or $G_h$, depending on the choice of Adam (ignore the action cheat for the moment,
which is explained later). Because of the imperfect information Eve does not observe Adam's
choice.

In $G_r$ and $G_h$ we removed the target states of $\sharp$ but Eve can still play $\sharp$: if in $G$ it was
leading to the final state it now behaves as Eve playing $w$ from $s_{hide}$, and otherwise it behaves
as Eve playing $t$ from $s_{hide}$ (see Figure 2).

Finally, in order to prevent Eve from playing an infinite sequence of actions without $\sharp$, we
add an extra small gadget where Adam is allowed to declare that Eve will cheat. If he plays
cheat from $s_{hide}$ this leads to a new state $s_c$ where the following may happen depending on
the next move of Eve (the action of Adam has no impact): if she plays $\sharp$ from $s_c$ then the play
goes to a sink state $s_w$ (that is not final); if she does not play $\sharp$ from $s_c$ then with probability
1/2 the play stays in $s_c$ and with probability 1/2 the play goes to a sink final state $s_l$. Hence,
from $s_c$, if she never plays $\sharp$, then the play almost-surely ends in $s_l$.

Let $G'_{hr}$ be this new game, where we recall that all states are indistinguishable for Eve,
$s_{hide}$ is the initial state and $\{s_{home}, s_l\}$ are the final states. We claim that Eve positively wins
game $G'_{hr}$ iff Eve in $G$ has an almost sure winning strategy that is not a sure winning strategy.
Indeed, consider a strategy $\varphi$ for Eve in $G'_{hr}$. As Eve cannot distinguish any state in $G'_{hr}$,
and does not observe the actions played by Adam, $\varphi$ is independent of Adam's choices.

If the strategy of Eve consists in playing $\sharp$ only finitely often, it cannot be positively
winning, as it suffices for Adam to wait for the last $\sharp$ and then play cheat. More precisely, the
strategy of Adam consists in playing (in state $s_{hide}$) the action $h$ whenever Eve's strategy will
still play $\sharp$ in the future, and cheat if Eve will never play $\sharp$ in the future. It can be shown that
following this strategy Adam wins against the strategy of Eve with probability 1.

Thus, in the following we only consider strategies of Eve that play $\sharp$ infinitely often. An
equivalent description of such strategies is by a sequence $(\varphi_i)_{i \ge 1}$ of strategies for Eve in $G$:
$\varphi$ consists in playing an arbitrary letter, then playing as $\varphi_1$ until playing $\sharp$, then playing an
arbitrary letter, then playing as $\varphi_2$ until playing $\sharp$, and so on (the arbitrary letter is used here
when Adam chooses to move to $G_r$, $G_h$ or $s_c$).

For one direction, assume that $\varphi$ is positively winning in $G'_{hr}$. Let $p_i$ be the probability
that Eve wins in $G$ when playing according to $\varphi_i$. Then, from the properties of $G_{hr}$, it follows
that $\varphi$ is winning iff $0 < p_i < 1$ for all $i \ge 1$ and $\prod_i p_i > 0$. This implies that the sequence
$(p_i)_{i \ge 1}$ converges to 1 and hence the $\varphi_i$ are strategies as in Lemma 1.

Conversely, if Eve has strategies winning with probabilities arbitrarily close to 1 as in
Lemma 1, then one can choose the $\varphi_i$ such that $1 > p_i > 1 - \frac{1}{(i+1)^2}$, which ensures $0 < p_i < 1$
for all $i \ge 1$ and $\prod_i p_i > 0$. Indeed,

$\prod_{i \ge 1} p_i \ge \prod_{i \ge 1} \left(1 - \frac{1}{(i+1)^2}\right) = \prod_{i \ge 1} \frac{i(i+2)}{(i+1)^2} = \frac{1}{2} > 0.$

This family $(\varphi_i)_{i \ge 1}$ defines a strategy for Eve in $G'_{hr}$. Again using the properties of $G_{hr}$, this
implies that Eve positively wins against all strategies of Adam: either no outcome ever reaches
$s_c$, in which case $G_{hr}$ is simulated, or if an outcome reaches $s_c$, then it does so with positive
probability, and then it also reaches $s_w$ with positive probability.

□ 
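
The positivity of the infinite product used in the converse direction can be checked with exact arithmetic. The sketch below (the function name is ours) computes the partial products of the lower bounds 1 - 1/(i+1)^2 for i from 1 to n; by telescoping they equal (n+2)/(2(n+1)), which decreases to 1/2.

```python
from fractions import Fraction


def partial_product(n):
    """Product of (1 - 1/(i+1)^2) for i = 1..n, computed exactly."""
    prod = Fraction(1)
    for i in range(1, n + 1):
        prod *= 1 - Fraction(1, (i + 1) ** 2)
    return prod
```

Since every partial product stays above 1/2, any choice of winning probabilities above these bounds has a strictly positive infinite product.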


4 Positive Winning in Reachability Games 

We now address the decidability of whether Eve positively wins in a reachability game, and 
we show decidability (and matching lower bounds) for the case where (i) Adam is perfectly 
informed and (ii) Adam is more informed than Eve. 

For the rest of this section fix an arena $\mathcal{A} = \langle S, \Sigma_E, \Sigma_A, \delta, \sim_E, \sim_A \rangle$ and a set of final states
$F \subseteq S$.

To later address almost-sure winning (Section 5) we need to consider games that may start
in different states, and we are interested in strategies that are winning from all of these states.
For this reason, we define for any subset $K$ of states a game $(\mathcal{A}, K, O)$ that is played as follows:
there is a new initial step where Adam picks a state $s_0$ in $K$ and then the play proceeds as
in $(\mathcal{A}, s_0, O)$. Hence, a strategy $\varphi$ for Eve in such a game is almost-surely (resp. positively)
winning iff $\varphi$ is almost-surely (resp. positively) winning in $(\mathcal{A}, s_0, O)$ for every state $s_0 \in K$.

4.1 Winning in a Finite Number of Moves. 

We start with a general result that does not depend on how the players are informed. It states 
that if Eve can positively win in a reachability game then she can do so within a bounded 
number of moves. 

Proposition 1. Let $K \subseteq S$ be a subset of states and assume that Eve has a positively winning
strategy $\varphi$ in the reachability game $(\mathcal{A}, K, F)$. Then, there is a bound $N$ and some $0 < \varepsilon_K < 1$
such that whenever Eve respects $\varphi$ in game $(\mathcal{A}, K, F)$, the probability that the resulting play
visits a final state within the $N$ first moves is at least $\varepsilon_K$.


Proof. For any $N \ge 0$, any $s \in K$ and any strategy $\psi_N$ for Adam, call $p_N^{s,\psi_N}$ the probability
of the event "a play in $(\mathcal{A}, s, F)$, where Eve respects $\varphi$ and Adam respects $\psi_N$, visits a final
state within the $N$ first moves".

Let $x_N^{\psi_N} = \min\{p_N^{s,\psi_N} \mid s \in K\}$. We aim to show that there exists some $N \ge 0$ such that
for each strategy $\psi_N$ for Adam, $x_N^{\psi_N} > 0$.

For this, we reason by contradiction, assuming that for any bound $N \ge 0$, Adam has
a counter-strategy $\psi_N$ such that $x_N^{\psi_N} = 0$. In particular, there is a state $s \in K$ such that
$p_N^{s,\psi_N} = 0$ for infinitely many $N$. Hence, we can assume that the $\psi_N$ are such that $p_N^{s,\psi_N} = 0$
for all $N \ge 0$ (as to get the property for some $N$ Adam can always use the strategy for some
$N' > N$).

Using $(\psi_N)_{N \ge 0}$ we define a strategy $\psi$ for Adam as follows. We first let $I_0 = \mathbb{N}$ be the set
of naturals. Next we define $\psi$ and $(I_k)_{k \ge 0}$, a decreasing sequence (for inclusion) of infinite
subsets of the naturals. First we sort partial plays by increasing length. We assume that $\psi$
is defined on all partial plays of length smaller than $k$ (hence initialization for $k = 0$ comes
for free) and for plays of length $k + 1$ we do the following. There are finitely many plays of
length $k + 1$ while $I_k$ is infinite: hence there is an infinite subset $I_{k+1} \subseteq I_k$ such that all the
$\psi_j$ for $j \in I_{k+1}$ agree on plays of length $k + 1$ and we define $\psi$ to behave accordingly.

The following is a direct consequence of the definition of $\psi$ and $(I_k)_{k \ge 0}$: for all $k \ge 0$, $I_k$
is infinite and for all $j \in I_k$, both $\psi$ and $\psi_j$ agree on any partial play of length smaller than
$k$.

In particular it implies that $p_N^{s,\psi} = 0$ for all $N \ge 0$: indeed, $p_N^{s,\psi_M} = 0$ for any $M \ge N$
and $\psi$ agrees with all $\psi_M$ with $M \in I_N$ (and as $I_N$ is infinite such an $M$ exists). Finally, as
$0 \le \Pr_s^{\varphi,\psi}(O) \le \sum_{N \ge 0} p_N^{s,\psi} = 0$ (here $O$ denotes the reachability objective defined by $F$), we
conclude that $\Pr_s^{\varphi,\psi}(O) = 0$, which contradicts our initial assumption of $\varphi$ being positively
winning in $(\mathcal{A}, s, F)$.

The fact that there is some $\varepsilon > 0$ such that $\varphi$ ensures reaching a final state in less than $N$
moves with a probability greater than $\varepsilon$ is a direct consequence of the fact that one bounds
the number of moves by $N$. □

Remark 3. A simple consequence of Proposition 1 is that finite memory suffices for Eve to
positively win in a reachability game. Indeed, it suffices to follow $\varphi$ for the $N$ first moves and
then play the same action forever.

Remark 4. An important consequence of the proof of Proposition 1 is that the values of the
probabilities do not have any influence on whether Eve positively wins in a reachability game.
More precisely, consider another arena $\mathcal{A}'$ that is exactly as $\mathcal{A}$ except that its transition function
$\delta'$ is such that for all states $s, s'$ and any pair of actions $(\sigma_E, \sigma_A)$ one has $\delta(s, \sigma_E, \sigma_A)(s') = 0$ iff
$\delta'(s, \sigma_E, \sigma_A)(s') = 0$. Then Eve positively wins in the reachability game $(\mathcal{A}, K, F)$ iff she positively
wins in the reachability game $(\mathcal{A}', K, F)$.

4.2 Positively Winning When Adam Is Perfectly Informed 

We now assume that Adam is perfectly informed. 

Consider, for all $n \ge 0$, the objective $\mathrm{Reach}^{\le n}(F) = (S \cdot (\Sigma_E \times \Sigma_A))^{\le n} F ((\Sigma_E \times \Sigma_A) \cdot S)^\omega$
where a final state has to be visited within the first $n$ steps. The following inductively
characterises the sets $K$ for which Eve can win $(\mathcal{A}, K, \mathrm{Reach}^{\le n}(F))$.

Proposition 2. Let $K \subseteq S$ be a set of pairwise $\sim_E$-equivalent states and let $n > 0$. Eve
positively wins $(\mathcal{A}, K, \mathrm{Reach}^{\le n}(F))$ if and only if there exists an action $\sigma_E \in \Sigma_E$ and a set
$K' \subseteq S$ such that

• Eve positively wins $(\mathcal{A}, K', \mathrm{Reach}^{\le n-1}(F))$,

• for all $s \in K \setminus F$ and for all $\sigma_A \in \Sigma_A$ there exists $s' \in K'$ such that $\delta(s, \sigma_E, \sigma_A)(s') > 0$.

Proof. Let $\varphi$ be a positively winning T-compatible strategy in $G$. Now use $\varphi$ in $G'$: obviously
it is still T-compatible and we have to prove that it is positively winning. Consider a strategy
$\psi'$ of Adam in $G'$. Then, assuming Eve respects $\varphi$, strategy $\psi'$ can be mimicked in game $G$:
indeed, Adam simply has to update a state $H$ in $\mathcal{A}'$, which is done by computing $\mathrm{Up}(H, \sigma_E, \sigma_A)$
and observing the equivalence class for the $\sim_A$ relation; assuming Eve respects $\varphi$ it means that
Adam always knows what action $\sigma_E$ she will play and therefore can compute $\mathrm{Up}(H, \sigma_E, \sigma_A)$.
Call $\psi$ the strategy in $G$ mimicking $\psi'$.

Now let $N$ be some integer and consider all those partial plays of length $N$ in $G$ where
Eve respects $\varphi$ and Adam respects $\psi$. Group all $\sim_A$-equivalent such partial plays: then for
every class consider the set $H$ of possible last states. Then those sets $H$ are exactly those
states that can be reached in $G'$ in a partial play of length $N$ when Eve respects $\varphi$ and Adam
respects $\psi'$. As $\varphi$ is positively winning in $G$, thanks to Proposition 1 there is some $N$ such
that Eve positively wins within the $N$ first moves and therefore for the same $N$ we conclude
that Eve positively wins within the $N$ first moves in $G'$ using $\varphi$ against $\psi'$. As this property
does not depend on $\psi'$ we conclude that $\varphi$ is positively winning in $G'$.

Conversely, assume she has a positively winning T-compatible strategy $\varphi$ in $G'$. Now use
$\varphi$ in $G$: obviously it is still T-compatible and we have to prove that it is positively winning.
By contradiction, assume Adam has a strategy $\psi$ that ensures, provided Eve uses $\varphi$ in $G$,
that no final state is reached. Then, from $\psi$ one can define a strategy $\psi'$ in $G'$ that consists,
in a partial play $H_0 (\sigma_E^0, \sigma_A^0) H_1 (\sigma_E^1, \sigma_A^1) \cdots H_k$, in playing action $\psi([s_0]_{\sim_A} \cdots [s_k]_{\sim_A})$ where $s_i$ is
any (they are all $\sim_A$-equivalent) element in $H_i$ for all $i$. Using the same argument as in the
direct implication relating plays in $G$ when using strategies $(\varphi, \psi)$ and plays in $G'$ when using
strategies $(\varphi, \psi')$, one concludes that playing $\psi'$ against $\varphi$ in $G'$ ensures that no final state is
visited, hence leading to a contradiction with $\varphi$ being positively winning in $G'$.

□ 


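Now, consider the increasing family of sets $(W_i)_{i \ge 0}$ defined by:

• $W_0 = \{K \mid K \subseteq F\}$

• $W_{i+1} = \{K \subseteq S \mid \forall r \in K,\ \exists \sigma_E\ \exists K' \in W_i \text{ s.t. } \forall s \sim_E r \text{ with } s \in K \setminus F,\ \forall \sigma_A,\ \exists s' \in K' \text{ s.t. } \delta(s,\sigma_E,\sigma_A)(s') > 0\}$

and call $W$ its limit. Then the following is a simple consequence of Proposition 2.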

Theorem 3. Let $K \subseteq S$ be a non-empty set. Eve has a positively winning strategy $\varphi$ in the
game $(\mathcal{A},K,F)$ if and only if $K \in W$. In particular it can be decided in time exponential in
$|S|$ whether Eve has a positively winning strategy. If such a strategy exists, one can construct
one that uses the set $2^S$ as memory, and this strategy guarantees to positively reach a final
state within the first $2^{|S|}$ moves.
Proof. Using Proposition 2, by a direct induction on $n$ one gets that $K \neq \emptyset$ belongs to $W_n$ if
and only if Eve positively wins $(\mathcal{A}, K, \mathrm{Reach}^{\le n}(F))$.

For any $K \in W$, we denote by $\mathrm{rk}(K)$ the smallest $n$ such that $K \in W_n$. Now, for any
$K \in W$, we define a strategy for Eve denoted $\varphi_K$ that uses $W$ as a finite memory. Initially
the memory is $K$. For a partial play $\lambda$ ending in a state in some equivalence class $C$ and
assuming that the memory is $K'$, we define the strategy as follows:

• If $\mathrm{rk}(K') > 0$ and if there exists $r \in C \cap K'$, then by definition of $(W_i)_{i \ge 0}$ there exists
some action $\sigma_E$ and some set $K''$ such that the following holds:

- $\mathrm{rk}(K'') = \mathrm{rk}(K') - 1$,

- $\forall r' \sim_E r$ with $r' \in K' \setminus F$, $\forall \sigma_A$, $\exists s' \in K''$ s.t. $\delta(r', \sigma_E, \sigma_A)(s') > 0$.

Then we let $\varphi_K(\lambda) = \sigma_E$ and update the memory to $K''$.

• In all other cases, we take $\varphi_K(\lambda)$ to be an arbitrary action and update the memory to $\emptyset$.


By induction on $n$, we show that for all non-empty $K \in W_n$, the strategy $\varphi_K$ is positively
winning in $(\mathcal{A}, K, \mathrm{Reach}^{\le n}(F))$. The base case is immediate. Assume that the property is
established for $n - 1 \ge 0$. Let $K$ be a non-empty element of $W_n$. Let $s_0 \in K$, $\sigma_E = \varphi_K([s_0]_{\sim_E})$
and $K' \in W_{n-1}$ be the memory of $\varphi_K$ after the first move. Let $\psi$ be a strategy for Adam. Let
$\sigma_A$ be the first action played by Adam when using $\psi$ and let $\psi'$ be the strategy followed by
Adam after this first step, i.e. $\psi'(\lambda) = \psi(s_0 \cdot (\sigma_E, \sigma_A) \cdot \lambda)$ for all partial plays $\lambda$. By definition
of $K'$, there exists $s' \in K'$ such that $\delta(s_0, \sigma_E, \sigma_A)(s') > 0$. Hence, we have:

$\Pr_{s_0}^{\varphi_K,\psi}(\mathrm{Reach}^{\le n}(F)) \ge \delta(s_0,\sigma_E,\sigma_A)(s') \cdot \Pr_{s'}^{\varphi_{K'},\psi'}(\mathrm{Reach}^{\le n-1}(F)) > 0$,

which concludes the proof. □
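The inductive family $(W_i)_{i \ge 0}$ can be computed by brute force on small arenas. The following is a minimal sketch, not the authors' implementation: the encoding of the arena (a dictionary `delta` from triples to distributions, a dictionary `obs_e` giving Eve's observation of each state) and the toy arena itself are assumptions made for illustration only.

```python
from itertools import combinations

def subsets(states):
    """All subsets of `states`, as frozensets."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def positive_levels(states, acts_e, acts_a, delta, obs_e, final):
    """Return the list [W_0, W_1, ...] up to the limit W."""
    all_sets = subsets(states)
    level = {K for K in all_sets if K <= final}          # W_0
    levels = [level]
    while True:
        def in_next(K, level=level):
            for r in K:
                # states of K \ F that Eve cannot tell apart from r
                cls = [s for s in K - final if obs_e[s] == obs_e[r]]
                # one action and one target set must work for the whole class
                if not any(
                    all(any(delta[(s, a, b)].get(t, 0) > 0 for t in K2)
                        for s in cls for b in acts_a)
                    for a in acts_e for K2 in level):
                    return False
            return True
        nxt = {K for K in all_sets if in_next(K)}
        if nxt == level:
            return levels
        levels.append(nxt)
        level = nxt

def rank(levels, K):
    """Smallest n with K in W_n, or None if K is not in W."""
    return next((n for n, lvl in enumerate(levels) if K in lvl), None)

# Hypothetical toy arena: Eve cannot distinguish 0 from 1; state 2 is final.
S = {0, 1, 2}
F = frozenset({2})
OBS_E = {0: "u", 1: "u", 2: "f"}
DELTA = {
    (0, "a", "x"): {2: 1.0}, (0, "b", "x"): {0: 1.0},
    (1, "a", "x"): {1: 1.0}, (1, "b", "x"): {2: 1.0},
    (2, "a", "x"): {2: 1.0}, (2, "b", "x"): {2: 1.0},
}
levels = positive_levels(S, ["a", "b"], ["x"], DELTA, OBS_E, F)
```

In this toy arena the set {0, 1} only enters the family at level 2: no single action reaches the final state from both states, but playing a and then b does. The enumeration of all subsets is exponential in $|S|$, in line with the bound stated in Theorem 3.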

The following is a restatement of the end of Theorem 3.

Corollary 1. In Proposition 1, when Adam is perfectly informed, one can always choose $\varphi$
such that $N \le 2^{|S|}$.

4.3 Automaton-Compatible Strategies 

The aim of this section is to refine Theorem 3 to positively winning strategies that satisfy
further constraints. The motivation is that in Section 5 we compute almost-sure winning
strategies for Büchi conditions using a fixpoint computation. In one iteration of this computation,
we compute positively winning strategies for reachability that satisfy an extra constraint
(roughly, that Eve can positively win the reachability game while ensuring that she can win
another round of the reachability game once the target set is reached). This further constraint
is expressible by finite automata that read partial plays and restrict the set of admissible next
actions for Eve. Thus, below we develop the notion of a strategy that is compatible with such
an automaton and then later apply it to the specific setting that we need.

Let $\mathcal{T} = (Q, \Sigma_E \times S/_{\sim_E}, q_0, q_s, \delta_{\mathcal{T}}, \mathrm{Act})$ be a deterministic finite automaton with input
alphabet $\Sigma_E \times S/_{\sim_E}$, a finite set of states $Q$, an initial state $q_0$, a sink state $q_s$, a transition
function $\delta_{\mathcal{T}} : Q \times (\Sigma_E \times S/_{\sim_E}) \to Q$ and a function $\mathrm{Act} : Q \to 2^{\Sigma_E}$ associating with any state
of $\mathcal{T}$ a subset of actions for Eve. Moreover, we require that the following holds:

• $\mathrm{Act}(q) = \emptyset$ if and only if $q = q_s$.

• For all states $q$ and for all $(\sigma, x) \in \Sigma_E \times S/_{\sim_E}$ one has $\delta_{\mathcal{T}}(q, (\sigma, x)) = q_s$ if and only if
$\sigma \notin \mathrm{Act}(q)$.

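Such a machine associates with any partial play $\lambda$ a unique state $q_\lambda$ defined by $q_{s_0} = q_0$ and
$q_{\lambda \cdot (\sigma_E,\sigma_A) \cdot s} = \delta_{\mathcal{T}}(q_\lambda, (\sigma_E, [s]_{\sim_E}))$; it also permits to associate with any partial play a subset of
actions by letting $\mathrm{Act}_{\mathcal{T}}(\lambda) = \mathrm{Act}(q_\lambda)$.

A strategy $\varphi$ of Eve is $\mathcal{T}$-compatible if for any partial play $\lambda$ where Eve respects $\varphi$ one
has $\varphi(\lambda) \in \mathrm{Act}_{\mathcal{T}}(\lambda)$. Note that it implies that $q_\lambda \neq q_s$.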
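To make the definition concrete, here is a small sketch of how such an automaton reads a partial play and how compatibility can be checked along one play. The list encoding of a partial play (states alternating with pairs of actions) and the example automaton, which forbids Eve from playing "b" twice in a row, are hypothetical and chosen only for illustration.

```python
def run_automaton(delta_t, q0, play, obs_e):
    """State q_lambda reached after the partial play s0 (aE,aA) s1 ... sk."""
    q = q0
    # play[0] is the initial state; then action pairs alternate with states
    for i in range(1, len(play), 2):
        (a_e, _a_a), s = play[i], play[i + 1]
        q = delta_t(q, a_e, obs_e[s])
    return q

def respects_automaton(delta_t, act, q0, play, obs_e):
    """True iff every action of Eve along `play` was allowed when she played it."""
    q = q0
    for i in range(1, len(play), 2):
        (a_e, _a_a), s = play[i], play[i + 1]
        if a_e not in act[q]:
            return False
        q = delta_t(q, a_e, obs_e[s])
    return True

# Hypothetical automaton: Act is empty exactly at the sink "qs".
ACT = {"q0": {"a", "b"}, "q1": {"a"}, "qs": set()}

def delta_t(q, a_e, obs):
    if a_e not in ACT[q]:          # forbidden action: go to the sink
        return "qs"
    return "q1" if a_e == "b" else "q0"

OBS_E = {0: "u", 1: "u"}
good = [0, ("b", "x"), 1, ("a", "x"), 0, ("b", "x"), 1]
bad = [0, ("b", "x"), 1, ("b", "x"), 0]
```

Here `good` stays away from the sink, while `bad` plays "b" twice in a row and therefore ends in the sink state. A strategy is compatible with the automaton when every partial play it can produce passes the `respects_automaton` check.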

Remark 5. In case $\mathcal{T}$ consists of an initial state and a non-reachable sink state (i.e. all
transitions from the initial state go back to it) and $\mathrm{Act}$ equals all actions $\Sigma_E$ in the initial
state, one has that any strategy is $\mathcal{T}$-compatible. Hence, any result we obtain later will also
hold if we drop the $\mathcal{T}$-compatibility constraint.

In Section 5 and for the proof of Theorem 4 we work with automata that compute the
knowledge of Eve along a play, as explained below. For an initial knowledge set $K_0 \subseteq S$ of
pairwise $\sim_E$-equivalent states, the knowledge (also known as belief) $\mathrm{Know}^{K_0}(\lambda)$ of Eve after
a partial play $\lambda$ starting in a state of $K_0$ intuitively corresponds to the set of possible states
that can have been reached in a play $\sim_E$-equivalent to $\lambda$.

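Formally, the value of $\mathrm{Know}^{K_0}(\lambda)$ can be inductively defined as follows: $\mathrm{Know}^{K_0}(s_0) = K_0$
and $\mathrm{Know}^{K_0}(\lambda \cdot (\sigma_E,\sigma_A) \cdot s) = \mathrm{UpKnow}_E(\mathrm{Know}^{K_0}(\lambda), \sigma_E, [s]_{\sim_E})$ where the function
$\mathrm{UpKnow}_E : 2^S \times \Sigma_E \times S/_{\sim_E} \to 2^S$ is defined by:

$\mathrm{UpKnow}_E(K, \sigma_E, [s]_{\sim_E}) = \{t \in [s]_{\sim_E} \mid \exists r \in K,\ \exists \sigma_A \in \Sigma_A \text{ s.t. } \delta(r,\sigma_E,\sigma_A)(t) > 0\}$.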
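The knowledge update translates directly into code. The sketch below assumes a dictionary encoding of $\delta$ and of Eve's observations; the two-state arena is hypothetical, with transitions picked so that all states are equivalent for Eve and the knowledge changes although her observation does not.

```python
def up_know(K, a_e, obs, acts_a, delta, obs_e):
    """UpKnow_E(K, a_e, obs): states observed as `obs` that are reachable
    from some state of K when Eve plays a_e, for some action of Adam."""
    return frozenset(
        t
        for r in K
        for b in acts_a
        for t, p in delta[(r, a_e, b)].items()
        if p > 0 and obs_e[t] == obs
    )

def know(K0, play, acts_a, delta, obs_e):
    """Knowledge of Eve after the partial play s0 (aE,aA) s1 ... sk."""
    K = frozenset(K0)
    for i in range(1, len(play), 2):
        (a_e, _a_a), s = play[i], play[i + 1]
        K = up_know(K, a_e, obs_e[s], acts_a, delta, obs_e)
    return K

# Hypothetical arena in which Eve observes nothing (a single class "o").
OBS_E = {"q0": "o", "q1": "o"}
DELTA = {
    ("q0", "a", "x"): {"q1": 1.0},
    ("q1", "a", "x"): {"q1": 1.0},
    ("q0", "b", "x"): {"q0": 0.5, "q1": 0.5},
    ("q1", "b", "x"): {"q0": 1.0},
}
K0 = {"q0", "q1"}
play = ["q0", ("a", "x"), "q1", ("b", "x"), "q0", ("b", "x"), "q1"]
```

Along `play` the knowledge goes from {q0, q1} to {q1}, then {q0}, then back to {q0, q1}, even though Eve's observation never changes.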



[Figure 3: Arenas and knowledges. Panel (a): arena of Remark 6.]



Remark 6. The knowledge is in general smaller than the currently observed equivalence
class. For instance, consider the reachability game depicted in Figure 3(a) in which all states
are equivalent. If the strategy of Eve is to play $(abb)^{\omega}$, then her observation is always the same
(as all states are equivalent). Her initial knowledge is $\{q_0, q_1\}$ but after playing $a$ it becomes
$\{q_1\}$, after a $b$ it becomes $\{q_0\}$, and after another $b$ it becomes $\{q_0, q_1\}$.

Remark 7. Given a family $\mathcal{K} \subseteq 2^S$ of knowledges for Eve (in the sense that each $K \in \mathcal{K}$
is a subset of a $\sim_E$-class), one can construct an automaton $\mathcal{T}_{\mathcal{K}}$ such that the $\mathcal{T}_{\mathcal{K}}$-compatible
strategies are precisely those such that Eve's knowledge always remains inside $\mathcal{K}$. The states
of $\mathcal{T}_{\mathcal{K}}$ are the elements of $\mathcal{K}$, the transition function is defined by $\mathrm{UpKnow}_E$, and the actions
$\mathrm{Act}(K)$ enabled at a state $K$ are those that ensure that the knowledge remains inside $\mathcal{K}$.

Remark 8. In \^, it is shown that if Eve can almost-surely win (using randomised 
strategies) a Büchi game, she can do so using a strategy $\varphi$ that only depends on the knowledge,
i.e. $\varphi(\lambda) = \varphi(\lambda')$ whenever $\mathrm{Know}^{K_0}(\lambda) = \mathrm{Know}^{K_0}(\lambda')$. However, even if Eve is playing alone,
this is no longer true (even for reachability games) in our setting where we restrict to pure
(i.e. non-randomised) strategies. Consider the reachability game in Figure 3 where Eve is
playing alone. The equivalence relation is given by si S 2 and t 2 /i / 2 - 

If the game starts in $s_0$ then whatever strategy Eve uses, her knowledge always coincides
with her observation. Eve can surely win (she can simply play the sequence aaab). But if her 
strategy only depends on her knowledge then she necessarily plays a sequence of actions of the 
form xu^ where x G {a, 6} and u is a two-letter word, and thus she has a probability ^ to win 
using such a strategy. 

We now return to the strengthening of Theorem 3. We assume that Adam is perfectly
informed and we fix an automaton $\mathcal{T} = (Q, \Sigma_E \times S/_{\sim_E}, q_0, q_s, \delta_{\mathcal{T}}, \mathrm{Act})$ as defined earlier in this section. We
are interested in checking whether Eve has a $\mathcal{T}$-compatible strategy that is positively winning
in the reachability game $(\mathcal{A},K,F)$.

Our main result is the following; its proof is by two successive reductions and an
application of Theorem 3.

Theorem 4. One can decide in time polynomial in $2^{|S|}$ and polynomial in $|Q|$ whether Eve
has a $\mathcal{T}$-compatible strategy that is positively winning in the reachability game $(\mathcal{A},K,F)$. If
such a strategy exists, one can construct one that uses memory of size polynomial in $|Q|$ and
exponential in $|S|$.

Proof. Note that adding the condition on the strategy being $\mathcal{T}$-compatible somehow means
that once a final state is reached the play is not yet won by Eve because she needs to keep
playing in accordance with $\mathcal{T}$ (i.e. she must avoid producing a partial play $\lambda$ with $q_\lambda = q_s$).
Hence, it is natural to consider an enriched arena $\mathcal{A}_{\mathcal{T}}$ that embeds $\mathcal{T}$. For this let $\mathcal{A}_{\mathcal{T}} =
(S \times Q, \Sigma_E, \Sigma_A, \delta', \approx_E, \approx_A)$ where

• $\delta'((s,q),\sigma_E,\sigma_A)(s',q')$ equals $\delta(s,\sigma_E,\sigma_A)(s')$ if $q' = \delta_{\mathcal{T}}(q, (\sigma_E,[s']_{\sim_E}))$ and otherwise it
equals 0;

• $(s_1,q_1) \approx_E (s_2,q_2)$ if and only if $s_1 \sim_E s_2$ and $q_1 = q_2$; and

• $\approx_A$ is the equality relation, i.e. Adam is perfectly informed.
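The enriched arena is an ordinary synchronous product. As a rough sketch under an assumed dictionary encoding of $\delta$, with a hypothetical automaton that forbids playing "b" twice in a row, its transition function can be built as follows; since the automaton is deterministic, each original successor $s'$ gives exactly one product successor.

```python
def enrich(delta, delta_t, aut_states, obs_e):
    """Transition function of the product arena over S x Q."""
    product = {}
    for (s, a_e, a_a), dist in delta.items():
        for q in aut_states:
            # q' is forced: it is the automaton state after reading (a_e, [s'])
            product[((s, q), a_e, a_a)] = {
                (s2, delta_t(q, a_e, obs_e[s2])): p for s2, p in dist.items()
            }
    return product

# Hypothetical automaton and arena, for illustration only.
ACT = {"q0": {"a", "b"}, "q1": {"a"}, "qs": set()}

def delta_t(q, a_e, obs):
    if a_e not in ACT[q]:
        return "qs"
    return "q1" if a_e == "b" else "q0"

OBS_E = {0: "u", 1: "u"}
DELTA = {
    (0, "a", "x"): {0: 0.5, 1: 0.5}, (0, "b", "x"): {1: 1.0},
    (1, "a", "x"): {0: 1.0},         (1, "b", "x"): {1: 1.0},
}
PRODUCT = enrich(DELTA, delta_t, ["q0", "q1", "qs"], OBS_E)
```

Every distribution in `PRODUCT` has the same probabilities as the corresponding one in `DELTA`; only the automaton component is added, and forbidden actions lead to states whose second component is the sink.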

Of special interest is the safety game $(\mathcal{A}_{\mathcal{T}}, K \times \{q_0\}, S \times \{q_s\})$ and we are interested in
sure winning for Eve because of the following straightforward lemma.

Lemma 2. Eve has a (possibly losing) $\mathcal{T}$-compatible strategy in the reachability game $(\mathcal{A}, K, F)$
if and only if she has a surely winning strategy in the safety game $(\mathcal{A}_{\mathcal{T}}, K \times \{q_0\}, S \times \{q_s\})$.

^This fact is also observed in [B]. 





It is a known result [3] that when one considers sure winning for Eve in a safety game,
winning strategies only depend on the knowledge of Eve (in the sense of Section 4.3). More
precisely consider the (unique) largest subset $\mathcal{K}$ of knowledges and the (unique) mapping
$\mathrm{Aut} : \mathcal{K} \to 2^{\Sigma_E}$ such that the following holds.

• No knowledge $K \in \mathcal{K}$ contains a forbidden state.

• For every $K \in \mathcal{K}$, the set $\mathrm{Aut}(K)$, which consists of all those actions $\sigma_E \in \Sigma_E$ such
that for every action $\sigma_A \in \Sigma_A$ and every observation $[s]_{\sim_E}$ one has $\mathrm{UpKnow}_E(K, \sigma_E, [s]_{\sim_E}) \in \mathcal{K} \cup \{\emptyset\}$, is not empty;
i.e. actions in $\mathrm{Aut}(K)$ are those that ensure that the updated knowledge will still be in
$\mathcal{K}$ regardless of the action of Adam.

Then Eve surely wins the safety game from configurations where her knowledge $K$ is in $\mathcal{K}$
and a strategy consists in choosing any action in $\mathrm{Aut}(K)$.
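The largest family $\mathcal{K}$ and the mapping $\mathrm{Aut}$ can be obtained by a standard greatest-fixpoint refinement: start from all knowledges without forbidden states and repeatedly discard those with no authorised action. The sketch below is one way to do this under an assumed dictionary encoding of the arena; it is not taken from [3], and the three-state arena is hypothetical.

```python
from itertools import combinations

def up_know(K, a_e, obs, acts_a, delta, obs_e):
    """Knowledge update: states observed as `obs` reachable from K under a_e."""
    return frozenset(t for r in K for b in acts_a
                     for t, p in delta[(r, a_e, b)].items()
                     if p > 0 and obs_e[t] == obs)

def safe_knowledges(states, acts_e, acts_a, delta, obs_e, forbidden):
    """Largest family of safe knowledges, with its authorised actions."""
    classes = {}
    for s in states:
        classes.setdefault(obs_e[s], []).append(s)
    # candidates: non-empty subsets of one class, without forbidden states
    cands = {frozenset(c)
             for members in classes.values()
             for r in range(1, len(members) + 1)
             for c in combinations(sorted(members), r)
             if not set(c) & forbidden}
    observations = set(obs_e.values())
    while True:
        def allowed(K, a):
            # every possible next knowledge must be empty or still a candidate
            for o in observations:
                K2 = up_know(K, a, o, acts_a, delta, obs_e)
                if K2 and K2 not in cands:
                    return False
            return True
        aut = {K: {a for a in acts_e if allowed(K, a)} for K in cands}
        kept = {K for K in cands if aut[K]}
        if kept == cands:
            return cands, aut
        cands = kept

# Hypothetical arena: 0 and 1 look the same to Eve, 2 is forbidden.
OBS_E = {0: "u", 1: "u", 2: "v"}
DELTA = {
    (0, "a", "x"): {0: 1.0}, (0, "b", "x"): {2: 1.0},
    (1, "a", "x"): {2: 1.0}, (1, "b", "x"): {1: 1.0},
    (2, "a", "x"): {2: 1.0}, (2, "b", "x"): {2: 1.0},
}
FAMILY, AUT = safe_knowledges({0, 1, 2}, ["a", "b"], ["x"], DELTA, OBS_E, {2})
```

In this example Eve is safe when she knows the exact state (she plays "a" in 0 and "b" in 1), but the knowledge {0, 1} is discarded because no single action avoids the forbidden state from both states.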

Note that in the safety game $(\mathcal{A}_{\mathcal{T}}, K \times \{q_0\}, S \times \{q_s\})$, Eve's knowledges are elements in
$2^S \times Q$ (as we have that $(s_1,q_1) \approx_E (s_2,q_2)$ implies $q_1 = q_2$).

Now consider an automaton $\mathcal{T}' = (Q', \Sigma_E \times S/_{\sim_E}, q'_0, q'_s, \delta'_{\mathcal{T}}, \mathrm{Act}')$ that computes Eve's
knowledge (as explained in Remark 7; hence, $\mathcal{T}'$ is the same as $\mathcal{T}_{\mathcal{K}}$) in the previous safety
game and uses the function $\mathrm{Act}' = \mathrm{Aut}$ to define the authorised actions. To fit the definition,
merge all knowledges not in $\mathcal{K}$ into a sink state and define $\mathrm{Aut}$ to be equal to $\emptyset$ on it. The
states $Q'$ of $\mathcal{T}'$ are the elements of $\mathcal{K}$ (plus the sink state) and we take as initial state $q'_0 = K \times \{q_0\}$
(which possibly is the sink state). In particular the number of states of $\mathcal{T}'$ is exponential in
$|S|$ and linear in $|Q|$.

Now one can go back to the original arena and consider the enriched arena $\mathcal{A}_{\mathcal{T}'}$. Then we
have the following easy lemma.

Lemma 3. Eve has a $\mathcal{T}$-compatible positively winning strategy in the reachability game
$(\mathcal{A}, K, F)$ if and only if she has a positively winning strategy in the reachability game
$(\mathcal{A}_{\mathcal{T}'}, K \times \{q'_0\}, F \times (Q' \setminus \{q'_s\}))$.

Moreover, from a positively winning strategy in the second game using memory of size
$N$ one can effectively construct a $\mathcal{T}$-compatible positively winning strategy in the reachability
game $(\mathcal{A},K,F)$ that uses a memory of size $O(N \times 2^{|S|} \times |Q|)$.

Proof. If Eve positively wins in $(\mathcal{A}_{\mathcal{T}'}, K \times \{q'_0\}, F \times (Q' \setminus \{q'_s\}))$ then we can safely assume
that she always plays authorised (according to $\mathrm{Act}'$) actions (otherwise the play
goes directly to $S \times \{q'_s\}$ and gets trapped in it forever, hence cannot reach $F \times (Q' \setminus \{q'_s\})$); hence her strategy
is $\mathcal{T}$-compatible thanks to Lemma 2. Such a strategy can be mimicked in the original game;
this requires simulating the automaton $\mathcal{T}'$, hence costs extra memory of the size of $\mathcal{T}'$.
Conversely, if she has a positively winning $\mathcal{T}$-compatible strategy in the original game, the
same strategy can be mimicked in the reduced game and is still positively winning. □

Now combining Lemma 3 together with Theorem 3 concludes the proof of Theorem 4. □

4.4 The Case Where Adam Is More Informed Than Eve 

We now assume that Adam is more informed than Eve and we fix an automaton $\mathcal{T} = (Q, \Sigma_E \times
S/_{\sim_E}, q_0, q_s, \delta_{\mathcal{T}}, \mathrm{Act})$ as in Section 4.3. Again, we are interested in checking whether Eve has
a $\mathcal{T}$-compatible strategy that is positively winning in the reachability game $(\mathcal{A},K,F)$. The
idea here is to reduce this question to one on a game where Adam is perfectly informed and
therefore conclude thanks to Theorem 4.

For this let $\mathcal{H}$ be all those subsets of $S$ that consist of pairwise $\sim_A$-equivalent states.
For such a subset $H$ and for any pair of actions $(\sigma_E,\sigma_A) \in \Sigma_E \times \Sigma_A$ define the set
$\mathrm{Up}(H,\sigma_E,\sigma_A) \subseteq \mathcal{H}$ as follows. First, define $M = \{s' \in S \mid \exists s \in H \text{ s.t. } \delta(s,\sigma_E,\sigma_A)(s') > 0\}$
as the set of all possible successors of states in $H$ when playing the pair of actions $(\sigma_E,\sigma_A)$
and let $\mathrm{Up}(H,\sigma_E,\sigma_A)$ consist of all those non-empty subsets $H'$ that can be written as
$H' = M \cap [s]_{\sim_A}$ for some $s \in S$, i.e. all possible indistinguishable (for Adam) subsets of $M$.

Define now a new arena $\mathcal{A}' = (\mathcal{H}, \Sigma_E, \Sigma_A, \delta', \approx_E, \approx_A)$ by letting

• $\delta'(H,\sigma_E,\sigma_A)(H') = 1/|\mathrm{Up}(H,\sigma_E,\sigma_A)|$ if $H' \in \mathrm{Up}(H,\sigma_E,\sigma_A)$ and 0 otherwise;

• $H_1 \approx_E H_2$ if $s_1 \sim_E s_2$ for all $s_1 \in H_1$ and $s_2 \in H_2$; and

• $\approx_A$ is the equality relation, i.e. Adam is perfectly informed.

Define the set of final states $F'$ as those elements $H$ in $\mathcal{H}$ such that $H \cap F \neq \emptyset$.
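The construction of $\mathrm{Up}$ and of $\delta'$ can be sketched as follows, again under an assumed dictionary encoding; `obs_a` (Adam's observation of each state) and the four-state example are hypothetical. Note that only the support of $\delta$ matters: the new arena uses a uniform distribution over the possible successor sets.

```python
def up(H, a_e, a_a, delta, obs_a):
    """Up(H, a_e, a_a): successors of H, split into Adam's observation classes."""
    M = {t for s in H for t, p in delta[(s, a_e, a_a)].items() if p > 0}
    groups = {}
    for t in M:
        groups.setdefault(obs_a[t], set()).add(t)
    return {frozenset(g) for g in groups.values()}

def delta_prime(H, a_e, a_a, delta, obs_a):
    """Uniform distribution over Up(H, a_e, a_a)."""
    succ = up(H, a_e, a_a, delta, obs_a)
    return {H2: 1 / len(succ) for H2 in succ}

# Hypothetical example: Adam cannot tell 0 from 1, nor 2 from 3.
OBS_A = {0: "p", 1: "p", 2: "q", 3: "q"}
DELTA = {
    (0, "a", "x"): {1: 0.5, 2: 0.5},
    (1, "a", "x"): {3: 1.0},
}
H = frozenset({0, 1})
```

From `H` the possible successors are 1, 2 and 3, which Adam splits into the two sets {1} and {2, 3}; each gets probability 1/2 in the new arena.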

Note that the equivalence classes of $\approx_E$ can be identified with the equivalence classes of
$\sim_E$ (because $\sim_A \subseteq \sim_E$) and therefore one can define $\mathcal{T}$-compatible strategies for Eve also in a
play in $\mathcal{A}'$. More generally, any strategy of Eve in one game can be used in the other one.

For a set $K \subseteq S$ define $\nu(K) \subseteq \mathcal{H}$ as $\nu(K) = \{\{s\} \mid s \in K\}$. The following proposition
relates game $(\mathcal{A},K,F)$ and game $(\mathcal{A}',\nu(K),F')$.

Proposition 3. A strategy of Eve is a positively winning $\mathcal{T}$-compatible strategy in $G =
(\mathcal{A}, K, F)$ if and only if it is a positively winning $\mathcal{T}$-compatible strategy in $G' = (\mathcal{A}', \nu(K), F')$.

Proof. Let $\varphi$ be a positively winning $\mathcal{T}$-compatible strategy in $G$. Now use $\varphi$ in $G'$: obviously
it is still $\mathcal{T}$-compatible and we have to prove that it is positively winning. Consider a strategy
$\psi'$ of Adam in $G'$. Then, assuming Eve respects $\varphi$, strategy $\psi'$ can be mimicked in game $G$:
indeed, Adam simply has to update a state $H$ in $\mathcal{A}'$, which is done by computing $\mathrm{Up}(H, \sigma_E, \sigma_A)$
and observing the equivalence class for the $\sim_A$ relation; assuming Eve respects $\varphi$ it means that
Adam always knows what action $\sigma_E$ she will play and therefore can compute $\mathrm{Up}(H,\sigma_E,\sigma_A)$.
Call $\psi$ the strategy in $G$ mimicking $\psi'$.

Now let $N$ be some integer and consider all those partial plays of length $N$ in $G$ where
Eve respects $\varphi$ and Adam respects $\psi$. Group all $\sim_A$-equivalent such partial plays: then for
every class consider the set $H$ of possible last states. Then those $H$ are exactly those
states that can be reached in $G'$ in a partial play of length $N$ when Eve respects $\varphi$ and Adam
respects $\psi'$. As $\varphi$ is positively winning in $G$, thanks to Proposition 1 there is some $N$ such
that Eve positively wins within the $N$ first moves and therefore for the same $N$ we conclude
that Eve positively wins within the $N$ first moves in $G'$ using $\varphi$ against $\psi'$. As this property
does not depend on $\psi'$ we conclude that $\varphi$ is positively winning in $G'$.

Conversely, assume she has a positively winning $\mathcal{T}$-compatible strategy $\varphi$ in $G'$. Now use
$\varphi$ in $G$: obviously it is still $\mathcal{T}$-compatible and we have to prove that it is positively winning.
By contradiction, assume Adam has a strategy $\psi$ that ensures, provided Eve uses $\varphi$ in $G$,
that no final state is reached. Then, from $\psi$ one can define a strategy $\psi'$ in $G'$ that consists,
in a partial play $H_0(\sigma_E^0,\sigma_A^0)H_1(\sigma_E^1,\sigma_A^1) \cdots H_k$, in playing action $\psi([s_0]_{\sim_A} \cdots [s_k]_{\sim_A})$ where $s_i$ is
any (they are all $\sim_A$-equivalent) element in $H_i$ for all $i$. Using the same argument as in the
direct implication relating plays in $G$ when using strategies $(\varphi, \psi)$ and plays in $G'$ when using
strategies $(\varphi, \psi')$, one concludes that playing $\psi'$ against $\varphi$ in $G'$ ensures that no final state is
visited, hence leading to a contradiction with $\varphi$ being positively winning in $G'$. □

Combining Proposition 3 with Theorem 4 directly leads to the following result.

Theorem 5. One can decide in time polynomial in $2^{2^{|S|}}$ and polynomial in $|Q|$ whether Eve
has a $\mathcal{T}$-compatible strategy that is positively winning in the reachability game $(\mathcal{A},K,F)$. If
such a strategy exists, one can construct one that uses memory of size polynomial in $|Q|$ and
doubly exponential in $|S|$.

5 Almost-Surely Winning for Büchi Conditions

For the rest of this section fix an arena $\mathcal{A} = (S, \Sigma_E, \Sigma_A, \delta, \sim_E, \sim_A)$ and a set of final states
$F \subseteq S$. We are interested in almost-sure winning strategies, and we focus on Büchi conditions,
as a solution for this case permits to obtain a solution for reachability conditions by a simple
reduction (change the arena so that whenever a final state is reached then the play stays in
it forever). For the moment we do not make any assumption on how Adam is informed.

We show how to compute the set of almost-surely winning knowledges of Eve, denoted
$\mathcal{K}^{AS}$, which is the set of subsets $K \subseteq S$ such that $K \subseteq [s]_{\sim_E}$ for some $s \in S$ and for which
Eve has an almost-surely winning strategy in the Büchi game $G_K = (\mathcal{A}, K, F)$.

5.1 Fixpoint Characterisation 

Theorem 6 below states that the set $\mathcal{K}^{AS}$ can be expressed as the greatest fixpoint of a
(monotone) mapping $\Xi : 2^{2^S} \to 2^{2^S}$ defined as follows. Let $\mathcal{K} \subseteq 2^S$ and let $K \in \mathcal{K}$. We say
that $K$ belongs to $\Xi(\mathcal{K})$ if Eve has a strategy in the reachability game $(\mathcal{A},K,F)$ which is
positively winning and guarantees that her knowledge always stays in $\mathcal{K}$.

Theorem 6. $\mathcal{K}^{AS}$ is the greatest fixpoint of $\Xi$.

Proof. For a subset $\mathcal{K}$ of knowledges, say that a knowledge $K$ of Eve is $\mathcal{K}$-good if she has a
strategy in the reachability game $(\mathcal{A}, K, F)$ which is positively winning and guarantees that
her knowledge always stays in $\mathcal{K}$.

We first argue that $\mathcal{K}^{AS}$ is a fixpoint for $\Xi$. For this we consider any $K \in \mathcal{K}^{AS}$ and prove
that it is $\mathcal{K}^{AS}$-good. We denote by $G_K$ the Büchi game $(\mathcal{A}, K, F)$ and we start with a simple
lemma.

Lemma 4. Let $K \in \mathcal{K}^{AS}$. Let $\varphi$ be any strategy for Eve that is almost-surely winning for
her in $G_K$ and let $\sigma_E = \varphi([s]_{\sim_E})$ for $s \in K$. Then, for any $\sigma_A \in \Sigma_A$, for any $t$ such that $\exists s \in K$ with
$\delta(s, \sigma_E, \sigma_A)(t) > 0$, $\mathrm{UpKnow}_E(K, \sigma_E, [t]_{\sim_E}) \in \mathcal{K}^{AS}$.

Proof. Consider some action $\sigma_A$ and some $t$ such that $\delta(s,\sigma_E,\sigma_A)(t) > 0$ and let $K' =
\mathrm{UpKnow}_E(K,\sigma_E,[t]_{\sim_E})$. By definition of $\mathrm{UpKnow}_E$, for all $t' \in K'$, there is some $s' \in K$
and some action $\sigma'_A$ such that $\delta(s',\sigma_E,\sigma'_A)(t') > 0$. Now, define the strategy $\varphi'$ of Eve by
letting $\varphi'(\lambda) = \varphi([s]_{\sim_E} \cdot \lambda)$ for any partial play $\lambda$. We claim that $\varphi'$ is almost-surely winning
for Eve in $G_{t'}$ for any $t' \in K'$, hence implying that $K' \in \mathcal{K}^{AS}$. By contradiction, assume that
$\varphi'$ is not almost-surely winning for some $G_{t'}$ with $t' \in K'$ and let $\psi'$ be a counter-strategy for
Adam in $G_{t'}$, i.e. $\Pr_{t'}^{\varphi',\psi'}(\mathcal{O}) < 1$ (recall that $\mathcal{O}$ denotes here the Büchi objective). Now, pick
$s' \in K$ such that $\delta(s',\sigma_E,\sigma'_A)(t') > 0$ and define a strategy $\psi$ of Adam by letting $\psi(s') = \sigma'_A$
and $\psi(s' \cdot (\sigma_E,\sigma'_A) \cdot \lambda) = \psi'(\lambda)$. Then as $\Pr_{t'}^{\varphi',\psi'}(\mathcal{O}) < 1$ one also has that $\Pr_{s'}^{\varphi,\psi}(\mathcal{O}) < 1$, which leads to
a contradiction. □

Fix a strategy $\varphi_K$ as in Lemma 4. A play $\lambda$ in $G_K$ where Eve respects $\varphi_K$ is such that
$\mathrm{Know}^{K}(\lambda) \in \mathcal{K}^{AS}$. Moreover, as $\varphi_K$ is almost-surely winning for the Büchi game $G_K$, it is in
particular positively winning in the reachability game $(\mathcal{A},K,F)$. Hence, using Proposition 1
one gets a bound $N_K$ and some $\varepsilon_K$, meaning that the probability of a play $\lambda$ in $G_K$ where Eve
respects $\varphi_K$ to visit a final state within its first $N_K$ moves is $\ge \varepsilon_K$. Hence, $K$ is $\mathcal{K}^{AS}$-good,
implying that $\mathcal{K}^{AS}$ is a fixpoint for $\Xi$.

Now we show that any fixpoint of $\Xi$ is included in $\mathcal{K}^{AS}$. For this assume that $\Xi(\mathcal{K}) = \mathcal{K}$
for some $\mathcal{K}$. As any $K \in \mathcal{K}$ is $\mathcal{K}$-good it comes with some $\varphi_K$, $N_K$ and $\varepsilon_K$. We let $N =
\max\{N_K \mid K \in \mathcal{K}\}$ and $\varepsilon = \min\{\varepsilon_K \mid K \in \mathcal{K}\}$.

Now we define a strategy $\varphi$ that consists in playing in rounds of length $N$: at the beginning
of some round, Eve considers her current knowledge $H$ and plays according to $\varphi_H$ in the next
$N$ moves; then she restarts with the updated knowledge, and so on forever.

Now consider some $K \in \mathcal{K}$. We claim that $\varphi$ is almost-surely winning for Eve in $G_K$.
Indeed, from the properties of the $\varphi_K$, it follows that any play in $G_K$ where Eve respects $\varphi$ is
such that the knowledge is in $\mathcal{K}$. Now, as the $\varphi_K$ ensure to visit a final state with probability
$\ge \varepsilon$ in less than $N$ moves, the Borel-Cantelli Lemma implies that $\varphi$ is almost-surely winning.
Hence, $K \in \mathcal{K}^{AS}$ and this concludes the proof. □
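The last step can be spelled out with an elementary bound. The following is only a sketch of the usual argument, under the assumption (which is how we read the choice of $N$ and $\varepsilon$ above) that in every round of $N$ moves a final state is visited with conditional probability at least $\varepsilon$, whatever the history and whatever Adam plays. Here $m$ and $k$ count rounds.

```latex
\Pr\bigl(\text{no final state is visited during rounds } m+1,\dots,m+k\bigr)
  \;\le\; (1-\varepsilon)^{k} \;\xrightarrow[k\to\infty]{}\; 0
\qquad\text{for every } m \ge 0 .
```

So, for each fixed $m$, a final state is visited after round $m$ with probability 1. Intersecting these countably many probability-1 events over all $m$ shows that final states are visited infinitely often with probability 1, which is the Büchi objective.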

5.2 Decidability Issues 

As $\Xi$ is monotone for set inclusion, it suffices to compute $\mathcal{K}^{AS}$ by successive applications
(starting with the set of all subsets) of the operator $\Xi$ until reaching the fixpoint. Since
$\mathcal{K}^{AS} \subseteq 2^S$ the fixpoint is reached in at most $2^{|S|}$ steps.
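The iteration itself is a plain downward fixpoint computation. In the sketch below the operator is passed in as a function `xi`; evaluating the real operator $\Xi$ requires the decision procedures of Section 4, so the code uses a hypothetical toy operator on integers as a stand-in, only to show the shape of the loop.

```python
def greatest_fixpoint(universe, xi):
    """Iterate a monotone operator downwards from `universe`.

    Returns the fixpoint reached and the number of strict shrinking steps.
    """
    current = frozenset(universe)
    steps = 0
    while True:
        nxt = frozenset(xi(current))
        if nxt == current:
            return current, steps
        current, steps = nxt, steps + 1

# Toy stand-in for the operator: keep k only if its successor is still present.
SUCC = {0: 0, 1: 0, 2: 3, 3: 4, 4: None}

def xi_toy(X):
    return {k for k in X if SUCC[k] in X}

RESULT, STEPS = greatest_fixpoint(range(5), xi_toy)
```

Each strict step removes at least one element, so the loop stops after at most as many steps as there are elements in the universe; for the operator of this section the universe is $2^S$.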

Now, as noted in Remark 7, the property for a strategy to guarantee that Eve's knowledge
remains in a set $\mathcal{K}$ can be expressed as the strategy being $\mathcal{T}_{\mathcal{K}}$-compatible (and the number of
states of $\mathcal{T}_{\mathcal{K}}$ is at most exponential in $|S|$). Therefore, thanks to Theorem 4 (resp. Theorem 5)
every step in the fixpoint computation can be achieved in time exponential (resp. doubly
exponential) in $|S|$ if Adam is perfectly informed (resp. more informed than Eve).

Theorem 7. Let $G$ be a Büchi (or reachability) game with $n$ states.

• If Adam is perfectly informed, one can decide whether Eve has an almost-surely winning
strategy in time exponential in $n$. If such a strategy exists, it can be effectively
constructed and requires memory at most exponential in $n$.

• If Adam is more informed than Eve, one can decide whether Eve has an almost-surely
winning strategy in time doubly exponential in $n$. If such a strategy exists, it can be
effectively constructed and requires memory at most doubly exponential in $n$.

Proof. Decidability follows from Theorem 4/Theorem 5 and the fixpoint characterisation given
in Theorem 6. The result on the strategies is also a consequence of Theorem 4/Theorem 5
combined with Corollary 1, which permits to bound the size of $N$ in the proof of Theorem 6. □
6 Lower Bounds

We now give a matching lower bound to the upper bounds in Theorem 5 and in Theorem 7
for the case where Adam is more informed than Eve. Note that in the case where Adam is
perfectly informed one can get a matching lower bound (ExpTime-hardness) as in the case
where randomised strategies are allowed [7].

Theorem 8. Deciding whether Eve has a positively winning (resp. an almost-surely winning)
strategy in a reachability game where Adam is more informed than her is a 2-ExpTime-hard
problem.

Proof. The idea is to simulate a computation of an alternating Turing machine that uses a
space of exponential size and to reduce termination to almost-surely winning for Eve. As
alternating Turing machines of exponential space are equivalent to deterministic Turing machines
working in doubly exponential time, this permits to obtain the desired lower bound. We
can safely assume that initially the input tape is made of $n$ distinguished symbols followed by
$2^n - n$ blank symbols. A configuration of the machine can be described by a word of length $2^n$
in $A^* Q A^*$ where $A$ is the tape alphabet (including a blank symbol) and $Q$ is the set of states of
the machine (including some final states): the meaning of a configuration $a_1 \cdots a_\ell\, q\, a_{\ell+1} \cdots a_{2^n}$
is that the tape content is $a_1 \cdots a_{2^n}$, the state is $q$ and the reading/writing head is on the
$\ell$-th cell. A run of the machine is a sequence of successive configurations separated by transitions
of the machine; it is accepting if it contains a final configuration (and in that case the
run is of finite length; otherwise it is of infinite length).

A classical way of thinking of an alternating Turing Machine is as a game where Eve is 
in charge of the choice of transitions when the machine is in an existential state while Adam 
takes care of the universal states. The machine accepts if and only if Eve has a winning 
strategy to eventually reach a configuration with a final control state. 

Consider now the following (informal) game. Eve is in charge of describing the run of the
Turing machine (her action alphabet contains all the necessary symbols for that, i.e. $A \cup Q$,
which permits the game to go to some associated states). After she has described a configuration,
either she (in case the state is existential) or Adam (in case the state is universal) describes a
valid transition of the machine (again by playing some special actions), and then Eve describes
the successor configuration and so on until possibly a final configuration is reached (in which
case she wins the game).

Of course the problem is that Eve could cheat and not describe a valid run. For this,
Adam can, in every configuration, secretly (i.e. Eve does not observe it) mark a cell of the
tape, and in the next configuration he can indicate a cell (supposedly with the same index as
the previously marked one) and it is checked whether it was wrongly updated: this is
easily done as the cells before and after the marked cell have been stored in the arena (and Eve
does not observe it of course) and together with the transition one can compute the correct
update of the cell. Now in case there is effectively a wrong update of the cell content, the
play restarts (i.e. the players restart from the initial configuration of the Turing Machine);
otherwise the play goes to a final state and Eve wins.

One problem in the previous simulation is that Adam could cheat by indicating two cells
that do not have the same index. If the space used by the machine was of polynomial
size, one could of course store the actual index and formally check it. Here, we use an extra
coding to circumvent this problem. When describing the configuration, after every symbol
Eve produces a sequence of $n$ bits whose meaning is to describe, in binary counting, the index
of the last symbol. When she describes such a binary number, Adam can secretly mark a bit
that he claims will not be correctly updated when describing the index of the next symbol (for
this he just plays an action that stands for a number between 1 and $n$) and this is checked next:
if she made an incorrect update, the play restarts (i.e. the players restart from the initial
configuration of the Turing Machine); otherwise the play goes to a final state where she wins.
One also uses this binary encoding of the index of the cell in the following way: whenever
Adam marks a symbol that he claims will be incorrectly updated in the next configuration,
a bit of its binary encoding is guessed (i.e. randomly chosen) and its index is stored and
observed by neither of the players. Later, when Adam indicates the supposed corresponding
symbol in the next configuration, the guessed bit is checked and should match: if not the play
goes to a final state and Eve wins; otherwise one does as previously explained (i.e. one checks
whether the symbol is correct: if not the play restarts, otherwise the play goes to a final state
and Eve wins).
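The check on a single marked bit of the index relies on the usual rule for binary increment: bit $i$ of $x+1$ differs from bit $i$ of $x$ exactly when all lower-order bits of $x$ are 1. The sketch below only illustrates this arithmetic rule; the bit order (least significant first) and the function names are assumptions, and the proof does not specify how the arena stores the information needed for the check.

```python
def expected_bit(prev_bits, i):
    """Bit i of prev+1, where prev_bits[0] is the least significant bit."""
    carry = all(b == 1 for b in prev_bits[:i])   # the carry reaches position i
    return prev_bits[i] ^ 1 if carry else prev_bits[i]

def marked_bit_is_correct(prev_bits, next_bits, i):
    """Did Eve update the marked bit i correctly between two consecutive indices?"""
    return next_bits[i] == expected_bit(prev_bits, i)

def to_bits(x, n):
    """n-bit encoding of x, least significant bit first."""
    return [(x >> k) & 1 for k in range(n)]
```

For instance, going from index 3 to index 4 on 4 bits, every one of the three low-order bits changes, and `marked_bit_is_correct(to_bits(3, 4), to_bits(4, 4), i)` holds for each position `i`. Deciding the expected value of bit `i` needs only one extra piece of information about the previous index (whether all lower bits were 1), which is why a single secretly marked position is enough to catch an incorrect update at that position.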

We claim that Eve positively wins (equivalently almost-surely wins) this game if and only
if the Turing Machine accepts. Once this is established the proof will be over, as one can easily
notice that the previous informal game can be encoded formally as a two-player game with
imperfect information of size polynomial in that of the Turing Machine.

Assume first that the Turing Machine accepts. Hence, it means that the existential player
Eve has a winning strategy in the acceptance game of the machine. Now, mimic this strategy
in the above described game: Eve always makes a correct description of a run and when
she has to choose a transition of the machine she does as in her winning strategy in the
acceptance game of the machine. We claim that this strategy is almost-surely winning (hence,
also positively winning). Indeed, any strategy of Adam that does not infinitely often claim
that a cell is incorrectly updated is surely losing for him because either he makes a wrong
claim (actually his claims are always wrong but here we mean he gets discovered because of
the hidden bit) or after some point the simulation goes to the end and finishes by a final
configuration of the Turing Machine. Now against this strategy of Eve, when Adam infinitely
often claims that a cell is incorrectly updated, he almost-surely gets caught because at every
claim there is a (fixed positive) probability (at least 1/n) that the secret bit does not match,
and by the Borel-Cantelli Lemma, the probability that he gets caught eventually is therefore 1.
Of course, if Adam claims at some point that a bit is incorrectly updated by Eve he also
loses (because she describes a valid run). Hence, Eve's strategy defeats any strategy of Adam
almost-surely.

Conversely assume that the Turing Machine does not accept. Hence, it means that the 
existential player Eve has no winning strategy in the acceptance game of the machine. Now 
consider a strategy of Eve. There are two possibilities. 

• Either there is a strategy of Adam (in fact a set of strategies that are indistinguishable from
Eve's point of view, including the ones where Adam claims she cheats) against which Eve's
strategy eventually cheats. Then consider the strategy of Adam that plays the same except
that he points to the moment where she cheats: then Eve must behave the same and therefore
the play restarts. Now, consider how she behaves in the restarted play and do the same
reasoning. If we are always in the same situation, iteratively playing a strategy pointing to
where she cheats in every simulation of the Turing Machine ensures that no final configuration
is reached and therefore that she surely loses.

• Or, against any strategy of Adam, Eve's strategy never cheats (i.e. describes a valid
run). Hence Eve's strategy can be seen as a strategy in the acceptance game of the
machine and therefore one can consider the strategy of the universal player that beats
it in the acceptance game and let Adam mimic it in the simulation game (and he never
claims that she cheats). Then this strategy leads to an infinite play that corresponds to
the description of an infinite run of the alternating Turing Machine that never visits a
final configuration: hence it surely defeats Eve's strategy.

Hence, for any strategy of Eve in the above described game there is a strategy of Adam 
that surely beats this strategy, which implies that there is no positively winning (hence almost- 
surely winning) strategy for Eve. This concludes the proof. □ 

7 Summary 

The landscape of decidability and undecidability results, with pointers to the literature and
to the results in our paper, is shown in Table 1. The entries of the form “1/2-Exptime-comp.”
refer to the two cases of Adam being perfectly informed and being better informed
than Eve, respectively (the result from [6] is for the case of Adam being perfectly informed).
The implication $\Rightarrow$ means that our result is an easy consequence of a result from the literature.
The undecidability results already hold for the case in which Adam is perfectly informed.



            | Safety        | Reachability             | Büchi             | co-Büchi
Positively  | Undecidable   | 1/2-Exptime-comp.        | Undecidable       | Undecidable
            | Th. 2         | [6], Th. 3/Th. 5 + Th. 8 | [1] ⇒ Th. 1       | Th. 2
Almost Sure | ExpTime-comp. | 1/2-Exptime-comp.        | 1/2-Exptime-comp. | Undecidable
            | [3]           | [6], Th. 7 + Th. 8       | Th. 7 + Th. 8     | [1] ⇒ Th. 1

Table 1: Landscape of decidability and undecidability results